HDDS-14940. Create skeleton for ozone admin upgrade status command#10011

Open
sodonnel wants to merge 2 commits into apache:HDDS-14496-zdu from sodonnel:HDDS-14940

Conversation

@sodonnel
Contributor

What changes were proposed in this pull request?

Create the skeleton for a new upgrade command, ozone admin upgrade status, which will eventually replace the existing commands. This initial version connects to SCM and pulls a placeholder response. The logic to actually query the SCM status will be added in follow-up PRs.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14940

How was this patch tested?

Tested manually in docker to prove the command is wired up correctly. Tests can be added when we have real functionality in the command to test.

@github-actions github-actions bot added the zdu label (Pull requests for Zero Downtime Upgrade (ZDU), https://issues.apache.org/jira/browse/HDDS-14496) on Mar 31, 2026
Contributor

@adoroszlai adoroszlai left a comment

Thanks @sodonnel for the patch.

Comment on lines +430 to +433
required bool scmFinalized = 1;
required int32 numDatanodesFinalized = 2;
required int32 numDatanodesTotal = 3;
required bool shouldFinalize = 4;
Contributor

Contributor Author

This is a new object - shouldn't all its fields be required as we have no backward compatibility issue here - nothing else can be using this object?

Contributor

It's about backward compatibility of a future version of the software. From the docs I linked:

required must not be used for new fields. Semantics for required fields should be implemented at the application layer instead

Contributor Author

OK - this probably should be communicated more widely, as it seems that the established pattern is now not to be followed, and I suspect many people don't know this. The current definition file is also full of required fields.
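
For illustration, here is a minimal sketch of what "required semantics at the application layer" could look like in Java. It does not use the generated protobuf classes: `UpgradeStatusFields` is a hypothetical stand-in in which a null value models an optional field that was not set, and `UpgradeStatusValidator` is an assumed helper name, not existing Ozone code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated message whose fields are all optional.
// A null value models "field not set on the wire".
final class UpgradeStatusFields {
  final Boolean scmFinalized;
  final Integer numDatanodesFinalized;
  final Integer numDatanodesTotal;

  UpgradeStatusFields(Boolean scmFinalized, Integer numDatanodesFinalized,
      Integer numDatanodesTotal) {
    this.scmFinalized = scmFinalized;
    this.numDatanodesFinalized = numDatanodesFinalized;
    this.numDatanodesTotal = numDatanodesTotal;
  }
}

// Hypothetical helper: the application, not the wire format, decides which
// fields it cannot work without.
final class UpgradeStatusValidator {
  private UpgradeStatusValidator() {
  }

  // Returns the names of fields the caller treats as mandatory but which are unset.
  static List<String> missingFields(UpgradeStatusFields status) {
    List<String> missing = new ArrayList<>();
    if (status.scmFinalized == null) {
      missing.add("scmFinalized");
    }
    if (status.numDatanodesFinalized == null) {
      missing.add("numDatanodesFinalized");
    }
    if (status.numDatanodesTotal == null) {
      missing.add("numDatanodesTotal");
    }
    return missing;
  }

  // Rejects a message that lacks any field the application needs.
  static void validate(UpgradeStatusFields status) {
    List<String> missing = missingFields(status);
    if (!missing.isEmpty()) {
      throw new IllegalArgumentException("Missing fields: " + missing);
    }
  }
}
```

With a real generated message the null checks would presumably become `hasXxx()` calls. The point of the sketch is only that a later version of the software can relax or drop a check without changing the wire format.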

Comment on lines +45 to +49
System.out.println("Update status:");
System.out.println(" SCM Finalized: " + status.getScmFinalized());
System.out.println(" Datanodes finalized: " + status.getNumDatanodesFinalized());
System.out.println(" Total Datanodes: " + status.getNumDatanodesTotal());
System.out.println(" Should Finalize: " + status.getShouldFinalize());
Contributor

nit: please use out() (inherited from AbstractSubcommand) instead of System.out.
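
A rough sketch of why an inherited `out()` accessor tends to be preferable: the output destination can be swapped, for example in a unit test, without touching `System.out`. The classes below are hypothetical and only mimic the idea. They are not Ozone's `AbstractSubcommand`, and the `setOut` method and the `PrintWriter` return type are assumptions made for this example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical base class that exposes the output stream through an accessor.
abstract class SketchSubcommand {
  private PrintWriter out = new PrintWriter(System.out, true);

  protected PrintWriter out() {
    return out;
  }

  // Assumed hook for tests; a real CLI framework may wire this differently.
  void setOut(PrintWriter writer) {
    this.out = writer;
  }
}

// Hypothetical subcommand that prints through out() instead of System.out.
final class SketchStatusCommand extends SketchSubcommand {
  void run(boolean scmFinalized) {
    out().println("SCM Finalized: " + scmFinalized);
  }
}

// Captures the command output in memory, as a test might do.
final class OutCaptureDemo {
  private OutCaptureDemo() {
  }

  static String capture(boolean scmFinalized) {
    StringWriter buffer = new StringWriter();
    SketchStatusCommand command = new SketchStatusCommand();
    command.setOut(new PrintWriter(buffer, true));
    command.run(scmFinalized);
    return buffer.toString();
  }
}
```

Here `OutCaptureDemo.capture(true)` returns the printed line as a string, which a test can assert on directly.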

*/
@CommandLine.Command(
name = "status",
description = "Upgrade status of the cluster",
Contributor

nit:

Suggested change
- description = "Upgrade status of the cluster",
+ description = "Show status of cluster upgrade",

*/
@CommandLine.Command(
name = "upgrade",
description = "Upgrade specific operations",
Contributor

nit:

Suggested change
- description = "Upgrade specific operations",
+ description = "Operations related to cluster upgrade",

@errose28 errose28 requested review from dombizita and errose28 April 7, 2026 13:09
Contributor

@errose28 errose28 left a comment

LGTM if we remove the parameters from queryUpgradeStatus since they won't be required in the new model.

String upgradeClientID, boolean force, boolean readonly)
throws IOException;

HddsProtos.UpgradeStatus queryUpgradeStatus(String upgradeClientID, boolean readonly) throws IOException;
Contributor

We don't need the client ID and readonly parameters for the new API. Finalization will no longer track message state per client, now that it is one atomic operation per component.

Comment on lines +44 to +49
// Temporary output to validate the command is working.
out().println("Update status:");
out().println(" SCM Finalized: " + status.getScmFinalized());
out().println(" Datanodes finalized: " + status.getNumDatanodesFinalized());
out().println(" Total Datanodes: " + status.getNumDatanodesTotal());
out().println(" Should Finalize: " + status.getShouldFinalize());
Contributor

This is fine for now. I'm thinking the final output would have a "human" version:

OM is not finalized
SCM is finalized
3/10 Datanodes are finalized

and then a --json version for scripting:

{
  "omFinalized": false,
  "scmFinalized": true,
  "finalizedDatanodes": 3,
  "totalDatanodes": 10
}

Note that the final version of this command will also query OM for its finalization status.

We also don't need to print the shouldFinalize parameter. That is for SCM to explicitly tell OM when it should finalize so OM does not need to infer this from datanode counts.
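
To make the two proposed output modes concrete, here is a small sketch that renders the same four values in both forms. `UpgradeStatusFormatter` and its method names are hypothetical. The JSON is assembled by hand only to keep the example self-contained, and a real implementation would more likely use the project's existing JSON utilities.

```java
// Hypothetical formatter for the "human" and --json outputs proposed above.
final class UpgradeStatusFormatter {
  private UpgradeStatusFormatter() {
  }

  // Renders "X is finalized" or "X is not finalized".
  private static String finalizedLine(String component, boolean finalized) {
    return component + " is " + (finalized ? "" : "not ") + "finalized";
  }

  // Human-readable form, one line per component.
  static String human(boolean omFinalized, boolean scmFinalized,
      int finalizedDatanodes, int totalDatanodes) {
    return String.join("\n",
        finalizedLine("OM", omFinalized),
        finalizedLine("SCM", scmFinalized),
        finalizedDatanodes + "/" + totalDatanodes + " Datanodes are finalized");
  }

  // JSON form for scripting. Only booleans and ints are emitted,
  // so no string escaping is needed in this sketch.
  static String json(boolean omFinalized, boolean scmFinalized,
      int finalizedDatanodes, int totalDatanodes) {
    return String.join("\n",
        "{",
        "  \"omFinalized\": " + omFinalized + ",",
        "  \"scmFinalized\": " + scmFinalized + ",",
        "  \"finalizedDatanodes\": " + finalizedDatanodes + ",",
        "  \"totalDatanodes\": " + totalDatanodes,
        "}");
  }
}
```

Calling `human(false, true, 3, 10)` and `json(false, true, 3, 10)` reproduces the two sample outputs shown in this comment.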

Contributor

I vote +1 on the human and JSON versions.

If we don't print the shouldFinalize parameter though (and probably even if we do), it would make a lot of sense to log it when it changes. That would be useful for troubleshooting.
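
A minimal sketch of the "log it when it changes" idea, assuming the value is observed repeatedly (for example on each status computation). `ShouldFinalizeTracker` is a hypothetical name, and the `Consumer<String>` stands in for whatever logger the real code would use.

```java
import java.util.function.Consumer;

// Hypothetical tracker that emits a log line only when shouldFinalize changes.
final class ShouldFinalizeTracker {
  private final Consumer<String> log;
  // null until the first value has been observed.
  private Boolean last;

  ShouldFinalizeTracker(Consumer<String> log) {
    this.log = log;
  }

  // Records the latest value; returns true if a change was logged.
  synchronized boolean update(boolean shouldFinalize) {
    if (last != null && last.booleanValue() == shouldFinalize) {
      return false;
    }
    String previous = last == null ? "unset" : last.toString();
    log.accept("shouldFinalize changed from " + previous + " to " + shouldFinalize);
    last = shouldFinalize;
    return true;
  }
}
```

Repeated observations of the same value stay silent, so the log shows only the transitions that matter for troubleshooting.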

Contributor

@dombizita dombizita left a comment

Overall looks good to me, after addressing the comments by @errose28. Tested it and the command is returning the placeholder response.

@octachoron
Contributor

Same here - the change lines up with what we talked about, it prints the expected content, and the new subcommands appear in the CLI help. I had one thought inline about printing shouldFinalize.

Labels

zdu Pull requests for Zero Downtime Upgrade (ZDU) https://issues.apache.org/jira/browse/HDDS-14496

5 participants